Tempo Induction Using Filterbank Analysis and Tonal Features

Authors

Aggelos Gkiokas

Vassilios Katsouros

George Carayannis

Abstract

This paper presents an algorithm that extracts the tempo of a musical excerpt. The proposed system assumes a constant tempo and operates directly on the audio signal. A sliding window is applied to the signal and two feature classes are extracted. The first class is the log-energy of each band of a mel-scale triangular filterbank, a common feature vector used in various MIR applications. The second class is a novel feature for the tempo induction task: the strengths of the twelve Western musical tones, at all octaves, are calculated for each audio frame, in a fashion similar to the Pitch Class Profile. The time-evolving feature vectors are convolved with a bank of resonators, each resonator corresponding to a target tempo. The results of each feature class are then combined to give the final output. The algorithm was evaluated on the popular ISMIR 2004 Tempo Induction Evaluation Exchange Dataset. Results demonstrate that the superposition of the different types of features enhances the performance of the algorithm, which ranks among the current state-of-the-art algorithms for the tempo induction task.
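The resonator-bank step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the resonator design or the combination rule, so the feedback comb filter, the fixed gain `alpha`, the peak-normalised summation in `combine`, and all function names below are assumptions chosen for illustration. Each feature track stands in for one time-evolving feature (for example, the log-energy of one mel band) sampled at the frame rate.

```python
def comb_resonator_energy(signal, lag, alpha=0.8):
    """Mean output energy of a feedback comb filter (assumed resonator form):
    y[n] = (1 - alpha) * x[n] + alpha * y[n - lag]."""
    y = [0.0] * len(signal)
    for n, x in enumerate(signal):
        feedback = y[n - lag] if n >= lag else 0.0
        y[n] = (1.0 - alpha) * x + alpha * feedback
    return sum(v * v for v in y) / len(y)


def tempo_scores(signal, frame_rate, tempi, alpha=0.8):
    """Score each candidate tempo (BPM) with a resonator whose delay
    equals one beat period expressed in frames."""
    scores = {}
    for bpm in tempi:
        lag = max(1, round(60.0 * frame_rate / bpm))
        scores[bpm] = comb_resonator_energy(signal, lag, alpha)
    return scores


def combine(score_sets):
    """Assumed combination rule: sum the peak-normalised scores."""
    combined = {}
    for scores in score_sets:
        peak = max(scores.values()) or 1.0  # guard against an all-zero track
        for bpm, value in scores.items():
            combined[bpm] = combined.get(bpm, 0.0) + value / peak
    return combined


def estimate_tempo(feature_tracks, frame_rate, tempi):
    """Return the candidate tempo with the highest combined score."""
    tempi = list(tempi)
    combined = combine([tempo_scores(t, frame_rate, tempi) for t in feature_tracks])
    return max(combined, key=combined.get)


if __name__ == "__main__":
    frame_rate = 100  # feature frames per second (illustrative value)
    # Two synthetic tracks pulsing every 50 frames, i.e. every 0.5 s.
    track_a = [1.0 if n % 50 == 0 else 0.0 for n in range(1000)]
    track_b = [0.5 if n % 50 == 10 else 0.0 for n in range(1000)]
    print(estimate_tempo([track_a, track_b], frame_rate, range(60, 181, 10)))
```

A resonator whose delay is a multiple of the beat period (here 60 BPM) also responds to the pulse train, only more slowly, which is the familiar octave ambiguity in tempo induction. The candidate grid is kept coarse in the demo so that each tempo maps to a distinct delay in frames.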

Similar resources

Context-dependent beat tracking of musical audio

We present a simple and efficient method for beat tracking of musical audio. With the aim of replicating the human ability of tapping in time to music, we formulate our approach using a two state model. The first state performs tempo induction and tracks tempo changes, while the second maintains contextual continuity within a single tempo hypothesis. Beat times are recovered by passing the outp...

Texture features for the reproduction of the perceptual organization of sound

Human categorization of sound seems predominantly based on sound source properties. To estimate these source properties we propose a novel sound analysis method, which separates sound into different sonic textures: tones, pulses, and broadband noises. The audible presence of tones or pulses corresponds to more extended cochleagram patterns than can be expected on the basis of correlations intro...

Mirex 2011: Audio Tag Classification Using Weighted-vote Nearest Neighbor Classification

In this long abstract, we present an algorithm for automatically annotating music with tags that is fast, scalable and relatively easy to implement. It uses acoustic similarity for propagating tags among audio items. The algorithm makes use of a variety of acoustical features, ranging from spectral features to rhythm, tonal and high-level features (such as mood, genre, gender). These features a...

Unsupervised Deep Auditory Model Using Stack of Convolutional RBMs for Speech Recognition

Recently, we proposed an unsupervised filterbank learning model based on the Convolutional RBM (ConvRBM). This model is able to learn auditory-like subband filters using speech signals as input. In this paper, we propose a two-layer Unsupervised Deep Auditory Model (UDAM) built by stacking two ConvRBMs. The first-layer ConvRBM learns a filterbank from speech signals and hence, it represents early aud...

Audio Visual Speech Enhancement

This thesis presents a novel approach to speech enhancement by exploiting the bimodality of speech production and the correlation that exists between audio and visual speech information. An analysis of the correlation between a range of audio and visual features reveals significant correlation between visual speech features and audio filterbank features. The amount of correlation was also...

Publication date: 2010